Modeling bias and variation in the stochastic processes of small RNA sequencing

Authors

  • Christos Argyropoulos
  • Alton Etheridge
  • Nikita Sakhanenko
  • David Galas
Abstract

The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data.
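The linear-quadratic mean-variance relation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the coefficients or the exact parameterization, so the form Var = a*mean + b*mean^2 and the names `linear_quadratic_variance`, `fit_linear_quadratic`, `a`, and `b` are assumptions made for illustration. This is a plain least-squares fit, not the GAMLSS procedure used by the authors.

```python
def linear_quadratic_variance(mean, a=1.0, b=0.1):
    """Variance implied by Var = a*mean + b*mean**2.

    The defaults for a and b are illustrative, not taken from the paper.
    """
    return a * mean + b * mean ** 2


def fit_linear_quadratic(means, variances):
    """Least-squares fit of Var = a*mean + b*mean**2 (no intercept).

    Solves the 2x2 normal equations for the design columns [mean, mean**2].
    """
    s11 = sum(m ** 2 for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m ** 2 * v for m, v in zip(means, variances))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("need at least two distinct non-zero means")
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


if __name__ == "__main__":
    # Hypothetical per-sequence mean counts spanning several orders of magnitude.
    means = [1.0, 10.0, 100.0, 1000.0]
    variances = [linear_quadratic_variance(m, a=1.0, b=0.05) for m in means]
    print(fit_linear_quadratic(means, variances))
```

With a = 1 this reduces to the familiar negative binomial form, in which the quadratic term dominates at high counts and the variance approaches the Poisson value (the mean) at low counts.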


Similar resources

A Statistical Study of two Diffusion Processes on Torus and Their Applications

Diffusion processes such as Brownian motions and Ornstein-Uhlenbeck processes are classes of stochastic processes that have been investigated by researchers in various disciplines, including the biological sciences. It is usually assumed that the outcomes of these processes lie in Euclidean spaces. However, some data in physical, chemical and biological phenomena indicate that they cann...

A Useful Family of Stochastic Processes for Modeling Shape Diffusions

One of the new areas of research emerging in the field of statistics is shape analysis. Shape is defined as all the geometrical information of an object whose location, scale and orientation are not of interest. Diffusion in shape analysis can be studied via either perturbation of the key coordinates identifying the initial object or random evolution of the shape itself. Reviewing the f...

Relation Between RNA Sequences, Structures, and Shapes via Variation Networks

Background: RNA plays a key role in many aspects of biological processes, and its tertiary structure is critical for its biological function. RNA secondary structure represents various significant portions of RNA tertiary structure. Since the biological function of RNA is inferred indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...

A Multi-Objective Mixed-Model Assembly Line Sequencing Problem With Stochastic Operation Time

In today’s competitive market, producers who can quickly adapt to the diverse demands of customers are successful. Therefore, to satisfy these market demands, the mixed-model assembly line (MMAL) is increasingly used in industry. A mixed-model assembly line is a type of production line in which varieties of products with common base characteristics are assembled o...

I-13: Transcriptome Dynamics of Human and Mouse Preimplantation Embryos Revealed by Single Cell RNA-Sequencing

Background: Mammalian preimplantation development is a complex process involving dramatic changes in the transcriptional architecture. However, the crucial transcriptional network and key hub genes that regulate the progression of preimplantation embryos remain unclear. Materials and Methods: Through single-cell RNA-sequencing (RNA-seq) of both human and mouse preimplantation embryos, ...



Journal title:

Volume 45, Issue

Pages -

Publication date: 2017